In-Memory Performance for Big Data

Authors

Goetz Graefe

Haris Volos

Hideaki Kimura

Harumi A. Kuno

Joseph Tucek

Mark Lillibridge

Alistair C. Veitch

Abstract

When a working set fits into memory, the overhead imposed by the buffer pool renders traditional databases noncompetitive with in-memory designs that sacrifice the benefits of a buffer pool. However, despite the large memory available with modern hardware, data skew, shifting workloads, and complex mixed workloads make it difficult to guarantee that a working set will fit in memory. Hence, some recent work has focused on enabling in-memory databases to protect performance when the working data set almost fits in memory. Contrary to those prior efforts, we enable buffer pool designs to match in-memory performance while supporting the “big data” workloads that continue to require secondary storage, thus providing the best of both worlds. We introduce here a novel buffer pool design that adapts pointer swizzling for references between system objects (as opposed to application objects), and uses it to practically eliminate buffer pool overheads for memory-resident data. Our implementation and experimental evaluation demonstrate that we achieve graceful performance degradation when the working set grows to exceed the buffer pool size, and graceful improvement when the working set shrinks towards and below the memory and buffer pool sizes.
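The pointer-swizzling idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the names `Page`, `BufferPool`, `fix`, `child`, and `evict` are hypothetical, the "disk" is a plain dictionary, and the rule that a page with swizzled children cannot be evicted is a simplification chosen for this example. The sketch only shows the general technique: a parent stores a child reference either as a page id (which needs a hash-table lookup in the buffer pool) or as a direct in-memory reference (which needs none).

```python
from typing import Dict, List, Optional, Union


class Page:
    """A page whose child references are page ids (int) or direct Page references."""

    def __init__(self, page_id: int, payload: str,
                 children: Optional[List[Union[int, "Page"]]] = None):
        self.page_id = page_id
        self.payload = payload
        self.children: List[Union[int, "Page"]] = list(children or [])
        self.parent: Optional["Page"] = None  # set while swizzled into a parent


class BufferPool:
    """Toy buffer pool (hypothetical API) with swizzled parent-to-child references."""

    def __init__(self, disk: Dict[int, Page]):
        self.disk = disk                    # stand-in for secondary storage
        self.frames: Dict[int, Page] = {}   # page id -> memory-resident page
        self.lookups = 0                    # number of hash-table lookups paid

    def fix(self, page_id: int) -> Page:
        """Conventional access path: look the page up by id, loading it if absent."""
        self.lookups += 1
        page = self.frames.get(page_id)
        if page is None:
            page = self.disk[page_id]       # simulated read from storage
            self.frames[page_id] = page
        return page

    def child(self, parent: Page, slot: int) -> Page:
        """Follow a child reference, swizzling it on first use."""
        ref = parent.children[slot]
        if isinstance(ref, Page):
            return ref                      # swizzled: no lookup needed
        page = self.fix(ref)                # unswizzled: pay the lookup once
        parent.children[slot] = page        # swizzle: replace id with reference
        page.parent = parent
        return page

    def evict(self, page: Page) -> None:
        """Unswizzle the parent's reference, then drop the page from memory."""
        if any(isinstance(c, Page) for c in page.children):
            raise ValueError("sketch restriction: unswizzle children first")
        if page.parent is not None:
            slot = next(i for i, c in enumerate(page.parent.children) if c is page)
            page.parent.children[slot] = page.page_id  # back to a page id
            page.parent = None
        del self.frames[page.page_id]
```

In this sketch, repeated calls to `child` on a resident page cost no lookups after the first one, and evicting a page restores the page id in its parent so the next access goes through `fix` again.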


Similar resources

P-V-L Deep: A Big Data Analytics Solution for Now-casting in Monetary Policy

The development of new technologies has confronted the entire domain of science and industry with issues of big data's scalability as well as its integration with the purpose of forecasting analytics in its life cycle. In predictive analytics, the forecast of near-future and recent past - or in other words, the now-casting - is the continuous study of real-time events and constantly updated whe...


Correlation of Big Data with Supply Chain Health Performance in Employees of the Tehran Intelligent Fuel System

Introduction: The dramatic growth of big data and its application in preventing waste of resources and increasing financial performance and supply chain health levels, need to be examined from different perspectives. This study aimed to determine the correlation between big data and supply chain health performance in employees of Tehran Intelligent Fuel System. Methods: In this descriptive cor...


Characterizing the Performance of Big Memory on Blue Gene Linux

Using Linux for high-performance applications on the compute nodes of IBM Blue Gene/P is challenging because of TLB misses and difficulties with programming the network DMA engine. We present a design and implementation of “big memory”—an alternative, transparent memory space for computational processes, which addresses these difficulties. The big memory uses extremely large memory pages availa...


Performance and Scalability Evaluation of 'Big Memory' on Blue Gene Linux

We address memory performance issues observed in Blue Gene Linux and discuss the design and implementation of “Big Memory”—an alternative, transparent memory space introduced to eliminate the memory performance issues. We evaluate the performance of Big Memory using custom memory benchmarks, NAS Parallel Benchmarks, and the Parallel Ocean Program, at a scale of up to 4096 nodes. We find that Bi...


A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure

The IP Lookup Process is a key bottleneck in routing due to the increase in routing table size, increasing traffic and migration to IPv6 addresses. The IP address lookup involves computation of the Longest Prefix Matching (LPM), for which existing solutions such as BSD Radix Tries scale poorly when traffic in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...



Journal: PVLDB

Volume 8

Publication date: 2014
